2. Field1 means and variances: space, time, spacetime - uwnd 1000 is my Field1, SLP is my Field2
The GRAND (space+time) mean of my x and y: -1.7987 m/s and 1010.97 hPa
The GRAND (space+time) variance of my x and y: 6.0236 m^2/s^2 and 2.3720 hPa^2
The GRAND standard deviations are: 2.4543 m/s and 1.5402 hPa
The SPATIAL variance of my TIME MEAN longitude section is: 4.5413 m^2/s^2 and 1.1617 hPa^2
The TEMPORAL variance of my LONGITUDE MEAN time series is: 0.0711 m^2/s^2 and 0.5488 hPa^2
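The quantities listed above can be sketched in a few lines. This is not the code used for the assignment: it is a minimal pure-Python illustration on a tiny made-up array `x[t][lon]`, and it assumes the population (divide-by-N) form of the variance. All variable names are hypothetical.

```python
from statistics import fmean, pvariance

# Tiny made-up field: rows are times, columns are longitudes (NOT the real data).
x = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [0.0, 3.0, 3.0],
    [1.0, 3.0, 8.0],
]

# GRAND (space+time) statistics use every value at once.
all_values = [v for row in x for v in row]
grand_mean = fmean(all_values)
grand_var = pvariance(all_values)
grand_std = grand_var ** 0.5

# Time mean at each longitude, then its variance across longitude (spatial).
time_mean = [fmean(col) for col in zip(*x)]
spatial_var = pvariance(time_mean)

# Longitude mean at each time, then its variance across time (temporal).
lon_mean = [fmean(row) for row in x]
temporal_var = pvariance(lon_mean)

print(grand_mean, grand_var, grand_std, spatial_var, temporal_var)
```

If the original analysis used the sample (divide-by-N-1) variance instead, `pvariance` would be swapped for `statistics.variance`; for a 240 x 144 array the difference is small.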
3. (the assignment part)
Confirm that the time mean of the anomalies as defined above is 0. Yes, xprime_bart=0.
Is the spatial mean of the anomalies (as defined above) 0? No, xprime_barxy is not 0.
Is it the same as the time series of the spatial mean of the raw data? Or is it a new thing?
As the plots above show, the full series and the anomaly series share many of the same jagged shifts, although these shifts are clearly smaller in amplitude than the full series itself. The difference between the two series is periodic and has a much larger amplitude. Because the difference is periodic, the spatial mean of the anomalies is related to the time series of the spatial mean of the raw data, but it is not the same thing: the anomaly series carries only the smaller wobbles that sit on top of the periodic cycle. The story is the same for U-wind 1000 and SLP.
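A small sketch may help show why the two series differ by something periodic. It assumes (this is an assumption about "as defined above") that the anomalies are the raw values minus the climatological cycle at each longitude. The array and the toy "year" of 3 steps are made up for illustration.

```python
from statistics import fmean

MONTHS = 3   # toy "year" length, assumed for illustration (12 for monthly data)
# Made-up field x[t][lon]: two toy years at two longitudes.
x = [
    [1.0, 5.0],
    [2.0, 7.0],
    [3.0, 6.0],
    [3.0, 9.0],
    [2.0, 9.0],
    [5.0, 4.0],
]
n_time, n_lon = len(x), len(x[0])

# Climatological cycle: mean over years for each calendar month and longitude.
clim = [
    [fmean([x[t][j] for t in range(m, n_time, MONTHS)]) for j in range(n_lon)]
    for m in range(MONTHS)
]
# Anomaly: raw value minus the climatology for that calendar month.
anom = [[x[t][j] - clim[t % MONTHS][j] for j in range(n_lon)]
        for t in range(n_time)]

time_mean_of_anom = [fmean(col) for col in zip(*anom)]   # zero at every longitude
raw_series = [fmean(row) for row in x]                   # spatial mean of raw data
anom_series = [fmean(row) for row in anom]               # spatial mean of anomalies
clim_series = [fmean(clim[t % MONTHS]) for t in range(n_time)]  # periodic part

print(time_mean_of_anom)
print(anom_series)
print([r - a for r, a in zip(raw_series, anom_series)])  # matches clim_series
```

Under that assumption, the spatial mean of the anomalies is the spatial mean of the raw data minus the spatial mean of the climatology, so the difference repeats every toy year while the anomaly series itself is generally nonzero.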
My CLIMATOLOGICAL ANNUAL CYCLE arrays have variance:
Climx Variance U-wind 1000 - 5.1552 m^2/s^2
Climy Variance SLP - 1.8683 hPa^2
My INTERANNUAL ANOMALY ARRAYS have variance:
Anomx Variance U-wind 1000 - 0.8684 m^2/s^2
Anomy Variance SLP - 0.5037 hPa^2
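As a quick consistency check, the climatological and anomaly variances reported above add up to the grand variances from section 2. This is what one would expect if the climatology and the anomalies are uncorrelated with each other. The sketch below only redoes that arithmetic with the reported numbers; the dictionary layout is just for illustration.

```python
# Variances reported above (U-wind 1000 in m^2/s^2, SLP in hPa^2).
reported = {
    "U-wind 1000": {"total": 6.0236, "clim": 5.1552, "anom": 0.8684},
    "SLP": {"total": 2.3720, "clim": 1.8683, "anom": 0.5037},
}

for name, v in reported.items():
    parts = v["clim"] + v["anom"]            # climatology + anomaly variance
    share = 100.0 * v["clim"] / v["total"]   # percent held by the climatology
    print(f"{name}: clim + anom = {parts:.4f}, total = {v['total']:.4f}, "
          f"climatology share = {share:.1f}%")
```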
Fill out a variance decomposition table for field 1: feel free to add columns if you can define other parts.
                                                       | U-wind 1000 (m^2/s^2) | SLP (hPa^2)
a) total variance of x                                 | 6.0236                | 2.3720
b) purely spatial (variance of TIME mean at each lon)  | 4.5413                | 1.1617
c) variance of (x minus its TIME mean at each lon)     | 1.4823                | 1.2103
d) purely temporal (variance of LON mean at each time) | 0.0711                | 0.5488
e) variance of (x minus its LON mean at each time)     |                       |
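In the table, rows b and c sum to row a for both fields (4.5413 + 1.4823 = 6.0236 and 1.1617 + 1.2103 = 2.3720). The sketch below illustrates why, on a made-up array and assuming population (divide-by-N) variances on a full rectangular grid: the total variance splits into the variance of a mean plus the variance of the deviations from that mean. The same identity holds for the longitude-mean split (rows d and e).

```python
from statistics import fmean, pvariance

# Made-up field x[t][lon]; the values are illustrative only.
x = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [0.0, 3.0, 3.0],
    [1.0, 3.0, 8.0],
]
total = pvariance([v for row in x for v in row])           # row a

# Split by the TIME mean at each longitude.
time_mean = [fmean(col) for col in zip(*x)]
b = pvariance(time_mean)                                   # row b
c = pvariance([v - time_mean[j] for row in x
               for j, v in enumerate(row)])                # row c

# Split by the LON mean at each time.
lon_mean = [fmean(row) for row in x]
d = pvariance(lon_mean)                                    # row d
e = pvariance([v - lon_mean[t] for t, row in enumerate(x)
               for v in row])                              # row e

print(total, b + c, d + e)   # the three numbers agree
```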
4. Further decomposition of anomx by scale (using rebinning).
What space and time scales (units: degrees and months) have the most variance in your anomx field?
For both U-wind 1000 and SLP, the largest variance in the plots of variance distribution by scale reduction factor (right figures) is in the bottom right corner. This implies that the variance occurs largely at shorter time scales and over larger spatial scales, as can be seen in the anomaly figures above. The impact of ENSO is clearly visible in both U-wind and SLP: it dominates the spatial scale (traveling across the Pacific) and occurs on time scales that are relatively short compared to its spatial extent. Thus rebinning to longer time scales removes the variance much more quickly than rebinning to longer longitude scales.
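The rebinning argument can be illustrated with a minimal sketch. The field below is synthetic (an assumption, not the real anomx): a pattern that is broad in longitude and changes sign every few time steps, plus weak noise. The function names `rebin` and `field_variance` are hypothetical helpers, not the assignment's code.

```python
import math
import random
from statistics import fmean, pvariance

def rebin(field, t_factor, x_factor):
    """Block-average field[t][lon] over t_factor times and x_factor longitudes."""
    n_t, n_x = len(field), len(field[0])
    out = []
    for t0 in range(0, n_t - t_factor + 1, t_factor):
        row = []
        for x0 in range(0, n_x - x_factor + 1, x_factor):
            block = [field[t][j]
                     for t in range(t0, t0 + t_factor)
                     for j in range(x0, x0 + x_factor)]
            row.append(fmean(block))
        out.append(row)
    return out

def field_variance(field):
    """Population variance over all (time, longitude) values."""
    return pvariance([v for row in field for v in row])

# Synthetic anomalies: period of 6 time steps, one wavelength across 16 longitudes.
random.seed(0)
n_t, n_x = 48, 16
anom = [[math.cos(2 * math.pi * t / 6) * math.cos(2 * math.pi * j / n_x)
         + 0.1 * random.gauss(0, 1)
         for j in range(n_x)] for t in range(n_t)]

base = field_variance(anom)
for t_factor, x_factor in [(1, 1), (6, 1), (1, 4), (6, 4)]:
    kept = field_variance(rebin(anom, t_factor, x_factor)) / base
    print(f"time x{t_factor}, lon x{x_factor}: fraction of variance kept = {kept:.2f}")
```

In this toy case, averaging in time wipes out most of the variance while averaging in longitude keeps most of it, which is the same qualitative behavior described above.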
5. Scatter plot, correlation and covariance, regression-explained variance
Based on your data fields (which you've seen pictures of), make subsets of your 2 variables x and y and make a scatter plot of these showing the strongest (positive or negative) correlation of one field with the other you can find. The subset might simply be all (x,t) values if your fields are very similar (olr, precip), or maybe the 240 time values at one longitude, or 144 longitudinal values in the time mean, or time series at different longitudes if some variability is offset in your two fields (like pressure and wind).
Finding the two subsets was rather tricky for these two fields (u-wind 1000 and SLP) because there is a longitude offset between events in the two fields. After some looping and trial and error, the best correlation was found between the time series at longitude 122.5 (for u-wind) and the time series at longitude 150 (for SLP). This pattern can be seen somewhat in the raw data (for the anomaly fields). The scatterplot is shown below.
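The kind of looping search described above might look like the following sketch. It is not the original code: the fields are synthetic, the longitude indices are arbitrary, and `best_pair` is a hypothetical helper that simply tries every pair of longitudes and keeps the strongest correlation in absolute value.

```python
import math
import random
from statistics import fmean

def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = fmean(a), fmean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

def best_pair(x, y):
    """Return (rho, i, j) with the largest |rho| between x[:, i] and y[:, j]."""
    x_cols, y_cols = list(zip(*x)), list(zip(*y))
    best = (0.0, None, None)
    for i, xc in enumerate(x_cols):
        for j, yc in enumerate(y_cols):
            rho = correlation(xc, yc)
            if abs(rho) > abs(best[0]):
                best = (rho, i, j)
    return best

# Synthetic fields (assumed for illustration): y at longitude index 3 is built
# to vary opposite to x at longitude index 1; everything else is noise.
random.seed(1)
n_t, n_x = 240, 5
x = [[random.gauss(0, 1) for _ in range(n_x)] for _ in range(n_t)]
y = [[random.gauss(0, 1) for _ in range(n_x)] for _ in range(n_t)]
for t in range(n_t):
    y[t][3] = -x[t][1] + 0.5 * random.gauss(0, 1)

rho, i, j = best_pair(x, y)
print(f"best |rho| at x index {i}, y index {j}: rho = {rho:.3f}")
```

Searching over all pairs costs on the order of (number of longitudes)^2 correlations, which is still cheap for 144 longitudes.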
Now consider the covariance and correlation of the two subset arrays entering your scatterplot.
What is the correlation coefficient corresponding to this scatter plot? rho= -0.7203
What are the standard deviations of your two data subsets? xstd=0.9919 (uwind), ystd=1.0899 (SLP)
What fraction of the variance of y can be 'explained' by linear regression on x (y = mx + b)? How does this relate to rho? How much y variance is explained? (variance: with units of y squared) What is m? Hint: these are simple questions: use the math formula, not a computer code (Hsieh section 1.4.2, Eq. 1.33).
In the equation y=mx+b, m is the regression coefficient and is equal to rho*std(y)/std(x). To find the amount of variance explained by the regression, one can write the residual variance as std(e)^2=std(y)^2*(1-rho^2), where e is the residual y-(mx+b) and (1-rho^2) is the fraction of variance not accounted for by the linear regression. Thus the fraction that is explained is simply rho^2.
Variance explained by regression = 51.89%
m = -0.7915
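The formulas above can be evaluated directly from the reported rho and standard deviations. The sketch below does only that arithmetic; because the inputs are rounded to four digits, the last digit of a result may differ slightly from the values quoted above.

```python
rho = -0.7203     # correlation reported above
xstd = 0.9919     # standard deviation of the u-wind subset (m/s)
ystd = 1.0899     # standard deviation of the SLP subset (hPa)

m = rho * ystd / xstd                        # slope of y = m*x + b
fraction_explained = rho ** 2                # fraction of var(y) explained
var_y = ystd ** 2
explained_var = fraction_explained * var_y   # explained variance, in hPa^2
residual_var = var_y * (1.0 - rho ** 2)      # std(e)^2, in hPa^2

print(f"m = {m:.4f}")
print(f"fraction explained = {100 * fraction_explained:.2f}%")
print(f"explained = {explained_var:.4f} hPa^2, residual = {residual_var:.4f} hPa^2")
```

The explained and residual parts add back up to the variance of y, which is the decomposition the question is asking about.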
What fraction of the variance of x can be 'explained' by linear regression on variable y? (x = ny + a)? How does this relate to rho? What is n? Hint: these are simple questions, use the math formula not computer code.
Same argument as before, just flip x and y. Variance explained by regression is the same as above.
n = rho*std(x)/std(y) = -0.6556
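One way to see that the explained fraction is the same in both directions is that the product of the two slopes is rho^2. The sketch below checks this with the reported (rounded) values, so the last digit of n may differ slightly from the value quoted above.

```python
rho, xstd, ystd = -0.7203, 0.9919, 1.0899   # values reported above

m = rho * ystd / xstd    # slope of y regressed on x
n = rho * xstd / ystd    # slope of x regressed on y

print(f"n = {n:.4f}, m * n = {m * n:.4f}, rho^2 = {rho ** 2:.4f}")
```

Note that n is not 1/m: the two regression lines coincide only when |rho| = 1.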
Now add uncorrelated (random) noise with variance 1 to one of your variables. This might be like observation error. noisey = y + random('Normal',0,1,size(y))
How did the variance of y change when this noise was added? The original variance of the subset of y is 1.1880. When random noise is added, it increases greatly, to 2.5712.
How did the correlation change? The correlation coefficient dropped in magnitude, from -0.7203 to -0.5035, so the correlation is weaker.
How do these changes affect the regression of y on x? How much (y+noise) variance is explained by linear regression on x? What is the new value of m in the new (noisey = mx + b) regression?
The same argument as above still holds, just with the new noisey variable in place of y.
Variance explained by regression is now = 25.35% (reduced by half!)
m_new = -0.8139
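The noise experiment can be mimicked with a short sketch. The x and y series below are synthetic stand-ins (an assumption, not the real subsets), and `random.gauss(0, 1)` plays the role of MATLAB's `random('Normal',0,1,...)`. For noise that is exactly uncorrelated with x and y, the variance of y would grow by 1, rho would shrink by std(y)/std(noisey), and the slope m would stay the same; a single finite random draw only approximates this, which is consistent with m_new above differing only slightly from m.

```python
import math
import random
from statistics import fmean, pvariance

def correlation(a, b):
    """Pearson correlation using population (divide-by-N) moments."""
    ma, mb = fmean(a), fmean(b)
    cov = fmean([(p - ma) * (q - mb) for p, q in zip(a, b)])
    return cov / math.sqrt(pvariance(a) * pvariance(b))

def slope(x, y):
    """Least-squares slope of y on x, written as rho * std(y) / std(x)."""
    return correlation(x, y) * math.sqrt(pvariance(y) / pvariance(x))

# Synthetic stand-ins for the two subsets (assumed, NOT the real data).
random.seed(2)
n = 240
x = [random.gauss(0, 1) for _ in range(n)]
y = [-0.8 * xi + 0.75 * random.gauss(0, 1) for xi in x]

# Python analogue of: noisey = y + random('Normal',0,1,size(y))
noisey = [yi + random.gauss(0, 1) for yi in y]

rho, rho_noisy = correlation(x, y), correlation(x, noisey)
# Idealized value if the noise were exactly uncorrelated with x and y.
rho_ideal = rho * math.sqrt(pvariance(y) / (pvariance(y) + 1.0))

print(f"var(y) = {pvariance(y):.3f}, var(noisey) = {pvariance(noisey):.3f}")
print(f"rho = {rho:.3f}, rho_noisy = {rho_noisy:.3f}, ideal = {rho_ideal:.3f}")
print(f"m = {slope(x, y):.3f}, m_new = {slope(x, noisey):.3f}")
```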
6. Lagged correlation, covariance, and cross-covariance: questions
Show the zero-lag spatial covariance and correlation structures for your primary field. Interpret the results.
As these plots show, most of the covariance occurs along two longitudes (approximately 75 and 175), which are negatively correlated with each other. Thus when the winds are strong at 75, they are weak at 175. These are also the two peaks in standard deviation. The correlation plot shows this slightly, but it is much clearer (at least for me) in the covariance plot.
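For reference, a zero-lag spatial covariance matrix and the matching correlation matrix could be computed as in the sketch below. The three-longitude anomaly array is made up, and the function names are hypothetical; the point is only that covariance keeps the amplitude information that correlation normalizes away.

```python
import math
from statistics import fmean

def covariance_matrix(field):
    """Zero-lag covariance between every pair of longitudes of field[t][lon]."""
    cols = [list(c) for c in zip(*field)]
    means = [fmean(c) for c in cols]
    n_t = len(field)
    return [[sum((a - means[i]) * (b - means[j])
                 for a, b in zip(cols[i], cols[j])) / n_t
             for j in range(len(cols))] for i in range(len(cols))]

def correlation_matrix(cov):
    """Normalize a covariance matrix by the standard deviations on its diagonal."""
    std = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [[cov[i][j] / (std[i] * std[j]) for j in range(len(cov))]
            for i in range(len(cov))]

# Made-up anomalies at three longitudes: a strong see-saw between the first and
# last columns and a weak middle column (illustrative only).
anom = [
    [2.0, 0.1, -1.5],
    [-1.0, -0.1, 1.0],
    [3.0, 0.0, -2.5],
    [-2.0, 0.1, 1.5],
    [-2.0, -0.1, 1.5],
]
cov = covariance_matrix(anom)
corr = correlation_matrix(cov)
print(cov[0][2], corr[0][2])   # negative: the two ends vary in opposition
print(cov[1][1], corr[1][1])   # tiny variance, yet correlation is 1 on the diagonal
```

Because the diagonal of the covariance matrix is the variance at each longitude, the high-variance longitudes stand out in a covariance plot, which may be why the pattern reads more clearly there than in the correlation plot.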
Show longitude-lag sections of the covariance or correlation of this field, for a base point at some longitude of interest.
The longitude I picked for the longitude-lag sections is the one with the highest covariance (longitude 175).
This nicely shows the zero-lag negative covariance with longitude 75, and it also shows a negative correlation associated with the events leading up to the strongest covariance.
Interpret the results in terms of the characteristic space and time scales of your anomalies. Can you see these characteristic scales in your original raw data?
You can clearly see this pattern in the plot of U-wind 1000 anomalies above. There are two areas of spatially larger anomalies centered around longitudes 75 and 175. These anomalies occur around the same time and are opposite in sign (negatively correlated).
Share a longitude-lag slice of your lagged covariance matrix for your TWO fields. Label it, interpret it.
For this plot, I picked longitude 122.5 (the important longitude for u-wind from the subset in section 5). This plot is similar to the figure above (for u-wind alone) and shows a positive covariance at zero lag near longitude 175 and a negative, but weaker, covariance at zero lag near longitude 75. The main difference is that the covariance signal is weaker than above.
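The longitude-lag sections in this section (for one field, or for two fields) can be sketched with one helper that takes a base-point time series from the first field and computes its lagged covariance with every longitude of the second field. The data are made up, the function names are hypothetical, and the sign convention (positive lag means the base point leads) and the divide-by-overlap normalization are assumptions rather than the assignment's definitions.

```python
from statistics import fmean

def lagged_covariance(base, other, lag):
    """Covariance of base[t] with other[t + lag]; positive lag means base leads."""
    n = len(base)
    mb, mo = fmean(base), fmean(other)
    pairs = [(base[t] - mb, other[t + lag] - mo)
             for t in range(n) if 0 <= t + lag < n]
    return sum(p * q for p, q in pairs) / len(pairs)

def lon_lag_section(field_a, base_lon, field_b, max_lag):
    """Rows are lags -max_lag..max_lag, columns are longitudes of field_b."""
    base = [row[base_lon] for row in field_a]
    cols_b = [list(c) for c in zip(*field_b)]
    return [[lagged_covariance(base, col, lag) for col in cols_b]
            for lag in range(-max_lag, max_lag + 1)]

# Made-up example: field_b at longitude index 1 repeats field_a at index 0
# two time steps later, with the opposite sign.
a0 = [0.0, 1.0, 0.0, -1.0, 0.0, 2.0, 0.0, -2.0, 0.0, 1.0, 0.0, -1.0]
field_a = [[v] for v in a0]
field_b = [[0.0, -(a0[t - 2] if t >= 2 else 0.0)] for t in range(len(a0))]

section = lon_lag_section(field_a, 0, field_b, max_lag=3)
lags = list(range(-3, 4))
strongest = min(range(len(lags)), key=lambda k: section[k][1])
print("most negative covariance at lag", lags[strongest])
```

Passing the same array as both `field_a` and `field_b` gives the single-field section; passing two different fields gives the cross-covariance section.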
U-wind 1000 and SLP
Angela Colbert